Constraint Relaxations for Discovering Unknown Sequential Patterns
Abstract
The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the high number of discovered patterns. However, the commonly accepted solution, the use of constraints, reduces the mining process to verifying which of the specified patterns are frequent, rather than discovering unknown and unexpected patterns. In this paper, we propose a new methodology for mining sequential patterns that keeps the focus on user expectations without compromising the discovery of unknown patterns. Our methodology is based on constraint relaxations, which are used to filter the accepted patterns during the mining process. We propose a hierarchy of relaxations, applied to constraints expressed as context-free languages, that classifies the previously proposed relaxations (legal, valid and naïve) and introduces several new classes. The new classes range from the approx and non-accepted relaxations to compositions of different types of relaxations, such as the approx-legal or the nonprefix-valid relaxations. Finally, we present a case study that shows the results achieved by applying this methodology to the analysis of the curricular sequences of computer science students.
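The idea of filtering patterns with a relaxed constraint can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the constraint language is stood in for by a small finite set of sequences rather than a context-free language; the naïve relaxation is read as "every item belongs to the constraint's alphabet"; the approx relaxation is read as "within a bounded edit distance of some accepted sequence"; and the non-accepted relaxation is read as "not accepted by the constraint". All function names and the course names are hypothetical.

```python
from typing import Callable, Set, Tuple

Pattern = Tuple[str, ...]
Filter = Callable[[Pattern], bool]


def edit_distance(a: Pattern, b: Pattern) -> int:
    """Levenshtein distance between two item sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def exact_constraint(language: Set[Pattern]) -> Filter:
    """The unrelaxed constraint: accept only sequences in the language."""
    return lambda p: p in language


def naive_relaxation(language: Set[Pattern]) -> Filter:
    """Assumed reading: every item must occur in the constraint's alphabet."""
    alphabet = {item for seq in language for item in seq}
    return lambda p: all(item in alphabet for item in p)


def approx_relaxation(language: Set[Pattern], max_errors: int) -> Filter:
    """Assumed reading: within max_errors edits of some accepted sequence."""
    return lambda p: any(edit_distance(p, seq) <= max_errors for seq in language)


def non_accepted_relaxation(language: Set[Pattern]) -> Filter:
    """Assumed reading: keep only patterns the constraint does not accept."""
    return lambda p: p not in language


# Hypothetical curricular sequences expected by the user.
expected = {
    ("calculus", "algebra", "statistics"),
    ("programming", "algorithms"),
}

# Hypothetical frequent patterns produced by some mining step.
candidates = [
    ("calculus", "algebra", "statistics"),
    ("calculus", "statistics"),
    ("statistics", "calculus"),
    ("physics",),
]

approx = approx_relaxation(expected, max_errors=1)
near_misses = [p for p in candidates if approx(p)]
```

In this toy setting the exact constraint keeps only the first candidate, while the approx relaxation with one allowed error also keeps `("calculus", "statistics")`, a pattern the user did not specify but that stays close to the stated expectations.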
Related resources
Mining Patterns Using Relaxations of User Defined Constraints
The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the high number of discovered patterns. However, the solution commonly accepted – the use of constraints – approximates the mining process to a hypothesis-testing task. In this paper, we propose a new methodology to mine sequential patterns, keeping the focus on user expectations, without compro...
Constraint-based sequential pattern mining: a pattern growth algorithm incorporating compactness, length and monetary
Sequential pattern mining is useful in several applications; for example, it finds the sequential purchasing behavior of the majority of customers from a large number of customer transactions. However, existing research on discovering sequential patterns is based on the concept of frequency and presumes that customer purchasing behavior sequences do not fluctuate with ...
Discovering Active and Profitable Patterns with RFM (Recency, Frequency and Monetary) Sequential Pattern Mining – A Constraint Based Approach
Sequential pattern mining is an extension of association rule mining that discovers time-related behaviors in a sequence database. It extends association rules by adding time to the transactions. The problem of finding association rules is concerned with intra-transaction patterns, whereas sequential pattern mining is concerned with inter-transaction patterns. Generalized Sequential Pattern (GSP) mining a...
Mining Constraint-based Multidimensional Frequent Sequential Pattern in Web Logs
In this paper we introduce an efficient strategy for discovering Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. This paper describes each of these phases in ...
WildSpan: Efficient Discovery of Functional Motifs Spanning Large Wildcard Regions from Protein Sequences
Motivation: Automatic extraction of motifs from biological sequences is an important problem in molecular biology. For proteins, it is desired to discover sequence motifs containing large irregular gaps as the contact residues associated with a functional site are not always from one region of the sequences. Discovering such patterns is a time-consuming task due to a large number of combination...